day-21-timeseries

Author

Megan Hoover

library(dataRetrieval)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tsibble) #need for the yearmonth class
Registered S3 method overwritten by 'tsibble':
  method               from 
  as_tibble.grouped_df dplyr

Attaching package: 'tsibble'

The following object is masked from 'package:lubridate':

    interval

The following objects are masked from 'package:base':

    intersect, setdiff, union
library(plotly)

Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout
library(feasts) #for subseries
Loading required package: fabletools
# Example: Cache la Poudre River at Mouth (USGS site 06752260)
poudre_flow <- readNWISdv(siteNumbers = "06752260",   # Download data from USGS for site 06752260
                          parameterCd = "00060",      # Parameter code 00060 = discharge (cfs)
                          startDate = "2013-01-01",   # Set the start date
                          endDate = "2023-12-31") |>  # Set the end date
  renameNWISColumns() |>                              # Rename columns to standard names (e.g., "Flow", "Date")
  mutate(Date = yearmonth(Date)) |>                   # Convert daily Date values into a year-month format (e.g., "2023 Jan")
  group_by(Date) |>                                   # Group the data by the new monthly Date
  summarise(Flow = mean(Flow))                       # Calculate the average daily flow for each month
GET:https://waterservices.usgs.gov/nwis/dv/?site=06752260&format=waterml%2C1.1&ParameterCd=00060&StatCd=00003&startDT=2013-01-01&endDT=2023-12-31

1. Convert to tsibble

Use as_tsibble() to convert the data.frame into a tsibble object. This will allow you to use the feasts functions for time series analysis.

poudre_tsibble <- poudre_flow |> 
  as_tsibble(index = Date)
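
A quick sanity check on the conversion can be sketched as follows. The data here are hypothetical (a synthetic `demo_flow` table, not the Poudre record), and the check assumes the monthly index should have no missing months; `is_tsibble()` and `has_gaps()` come from the tsibble package.

```r
library(tibble)
library(tsibble)

# Hypothetical monthly data standing in for poudre_flow (not real USGS values)
demo_flow <- tibble(
  Date = yearmonth("2013 Jan") + 0:23,            # 24 consecutive months
  Flow = 100 + 50 * sin(2 * pi * (0:23) / 12)     # synthetic seasonal signal
)

# Same conversion as above: Date is the time index
demo_tsibble <- as_tsibble(demo_flow, index = Date)

# Confirm the object class and look for missing months
is_tsibble(demo_tsibble)
has_gaps(demo_tsibble)
```

If `has_gaps()` reports gaps for a real series, `fill_gaps()` can make the missing months explicit before any decomposition is attempted.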

2. Plotting the time series

Use ggplot to plot the time series data. Make the plot interactive with plotly.

p <- ggplot(poudre_tsibble, aes(x = Date, y = Flow)) +
  geom_line(color = "darkblue", linewidth = 1.2) +
  labs(title = "Interactive Monthly Streamflow - Cache la Poudre River",
       x = "Date", y = "Flow (cfs)") +
  theme_minimal()

ggplotly(p)

3. Subseries

Use gg_subseries to visualize the seasonal patterns in the data. This will help you identify any trends or seasonal cycles in the streamflow data. Describe what you see in the plot. How are “seasons” defined in this plot? What do you think the “subseries” represent?

# Subseries plot
poudre_tsibble |> 
  gg_subseries(Flow) +
  labs(title = "Seasonal Subseries Plot of Streamflow",
       y = "Flow (cfs)") +
  theme_minimal()

The subseries plot shows that monthly flows tend to be highest in May and June. In this plot, "seasons" are defined as calendar months: each panel is one month, and each subseries is that month's flow traced across all years of the record. The months with the most flow are the spring months. I think this reflects the precipitation and snowmelt events that happen in the spring, which cause higher flow.
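
The reading of the subseries plot can be cross-checked numerically by averaging flow within each calendar month, which is the quantity the horizontal line in each subseries panel marks. The sketch below uses hypothetical synthetic data (`demo_tsibble`, built with an assumed June peak) rather than the Poudre record; swapping in `poudre_tsibble` would give the real ranking.

```r
library(tibble)
library(dplyr)
library(lubridate)
library(tsibble)

# Hypothetical three-year monthly series with a peak around June
dates <- yearmonth("2013 Jan") + 0:35
mon   <- month(as.Date(dates))                     # calendar month, 1 to 12

demo_tsibble <- tibble(
  Date = dates,
  Flow = 100 + 80 * exp(-((mon - 6)^2) / 4)        # synthetic late-spring pulse
) |>
  as_tsibble(index = Date)

# Mean flow per calendar month, highest first
monthly_means <- demo_tsibble |>
  as_tibble() |>                                   # drop the time index before regrouping
  mutate(Month = month(as.Date(Date))) |>
  group_by(Month) |>
  summarise(mean_flow = mean(Flow)) |>
  arrange(desc(mean_flow))

monthly_means
```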

4. Decompose

Use the model(STL(…)) pattern to decompose the time series data into its components: trend, seasonality, and residuals. Choose a window that you feel is most appropriate to this data. Describe what you see in the plot. How do the components change over time? What do you think the trend and seasonal components represent?

# Decompose the time series with STL; season(window = 12) and trend(window = 24) set the smoothing windows
poudre_decomposed <- poudre_tsibble |> 
  model(stl = STL(Flow ~ season(window = 12) + trend(window = 24)))

# Visualize the decomposition
components(poudre_decomposed) |> 
  autoplot() +
  labs(title = "Decomposition of Cache la Poudre River Streamflow", 
       subtitle = "Trend, Seasonality, and Residuals")

This plot shows the flow, trend, seasonal, and remainder components from 2013 to 2023. The trend shows the long-term streamflow pattern: it keeps a similar shape throughout the record, but its magnitude declines over time. The seasonal cycle repeats every 12 months, with some variation in the peaks that could be due to the climate conditions of each year. The residuals capture short-term fluctuations that aren’t explained by the seasonal and trend components. At some times of the year the residual is large, meaning that the trend and seasonal components are not capturing all aspects of the data.
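
One way to make the statement about large residuals concrete is to check that the three components add back up to the original series and to compare the variance of the remainder with the variance of the flow. This is a minimal sketch on hypothetical synthetic data (`demo_tsibble`, with an assumed declining trend and 12-month cycle), reusing the same window settings as above; the column names `trend`, `season_year`, and `remainder` are those returned by `components()` for a monthly STL model.

```r
library(tibble)
library(dplyr)
library(tsibble)
library(fabletools)
library(feasts)

set.seed(21)
n <- 120                                           # ten years of monthly values
t <- 0:(n - 1)

# Hypothetical series: declining trend + annual cycle + noise
demo_tsibble <- tibble(
  Date = yearmonth("2013 Jan") + t,
  Flow = 200 - 0.5 * t + 80 * sin(2 * pi * t / 12) + rnorm(n, sd = 10)
) |>
  as_tsibble(index = Date)

# Same STL pattern as in the decomposition step
demo_components <- demo_tsibble |>
  model(stl = STL(Flow ~ season(window = 12) + trend(window = 24))) |>
  components() |>
  as_tibble()

# STL is additive: trend + seasonal + remainder should rebuild the series
rebuilt <- with(demo_components, trend + season_year + remainder)
all.equal(demo_components$Flow, rebuilt)

# Share of total variance left in the remainder (smaller means a better fit)
remainder_share <- var(demo_components$remainder) / var(demo_components$Flow)
remainder_share
```

Running the same two checks on `poudre_decomposed` would show how much of the Poudre streamflow variability the chosen windows leave unexplained.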